---
title: "MultiQueryTextRetriever"
id: multiquerytextretriever
slug: "/multiquerytextretriever"
description: "Retrieves documents using multiple queries in parallel with a text-based Retriever."
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';

# MultiQueryTextRetriever

Retrieves documents using multiple queries in parallel with a text-based Retriever.

<div className="key-value-table">

|  |  |
| --- | --- |
| **Most common position in a pipeline** | After a [`QueryExpander`](../query/queryexpander.mdx) component, before a [`ChatPromptBuilder`](../builders/chatpromptbuilder.mdx) in RAG pipelines |
| **Mandatory init variables** | `retriever`: A text-based Retriever (such as `InMemoryBM25Retriever`) |
| **Mandatory run variables** | `queries`: A list of query strings |
| **Output variables** | `documents`: A list of retrieved documents sorted by relevance score |
| **API reference** | [Retrievers](/reference/retrievers-api) |
| **GitHub link** | https://github.com/deepset-ai/haystack/blob/main/haystack/components/retrievers/multi_query_text_retriever.py |

</div>

## Overview

`MultiQueryTextRetriever` wraps a text-based Retriever (such as `InMemoryBM25Retriever`) and runs it once for each query string, using a thread pool to process the queries in parallel. Searching with several queries instead of one can improve retrieval recall.

The component:
- Processes queries in parallel for better performance
- Automatically deduplicates results based on document content
- Sorts the final results by relevance score
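
The three behaviors listed above can be illustrated with a minimal plain-Python sketch. This is not Haystack's implementation: `CORPUS`, `toy_retrieve`, and `multi_query_retrieve` are hypothetical names, the scorer is a simple word match rather than BM25, and keeping the highest score for a duplicated text is an assumption of the sketch.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy corpus; a stand-in for a document store.
CORPUS = [
    "Renewable energy is collected from renewable resources.",
    "Hydropower is a form of renewable energy.",
    "Geothermal energy is heat from the earth.",
]


def toy_retrieve(query: str) -> list[tuple[str, float]]:
    """Score each text by how many query words appear in it (substring match, not BM25)."""
    words = query.lower().split()
    hits = []
    for text in CORPUS:
        score = float(sum(word in text.lower() for word in words))
        if score > 0:
            hits.append((text, score))
    return hits


def multi_query_retrieve(queries: list[str], max_workers: int = 3) -> list[tuple[str, float]]:
    # 1. Run one retrieval per query in a thread pool.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_query_hits = list(pool.map(toy_retrieve, queries))

    # 2. Deduplicate by content. Keeping the highest score per text
    #    is an assumption of this sketch.
    best: dict[str, float] = {}
    for hits in per_query_hits:
        for text, score in hits:
            best[text] = max(score, best.get(text, score))

    # 3. Sort by score, highest first.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


merged = multi_query_retrieve(["renewable energy", "hydropower", "geothermal"])
for text, score in merged:
    print(f"{score:.1f} | {text}")
```

Each text appears once in `merged` even when several queries match it, which is the effect the deduplication step is meant to have.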

This Retriever is particularly effective when combined with [`QueryExpander`](../query/queryexpander.mdx), which generates multiple query variations from a single user query. By searching with these variations, you can find documents that use different keywords than the original query.

Use `MultiQueryTextRetriever` when your documents use different words than your users' queries, or when you want to combine query expansion with keyword-based search (BM25). Running several queries takes longer than running one. To reduce the added latency, increase `max_workers` so that more queries run at the same time.
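
To show what a `max_workers` setting controls in a thread pool, here is a small plain-Python sketch. It is not Haystack code: `slow_search` is a hypothetical stand-in for a retriever call that waits on I/O, and the `stats` counters exist only to record how many calls overlap.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

lock = threading.Lock()
stats = {"active": 0, "peak": 0}  # searches in flight now / highest overlap seen


def slow_search(query: str) -> str:
    """Hypothetical stand-in for a retriever call that waits on I/O."""
    with lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.05)  # simulated latency
    with lock:
        stats["active"] -= 1
    return f"results for {query!r}"


queries = ["renewable energy", "sustainable power", "green electricity", "clean power"]
max_workers = 2

# The pool never runs more than `max_workers` searches at once;
# the remaining queries wait until a worker thread is free.
with ThreadPoolExecutor(max_workers=max_workers) as pool:
    results = list(pool.map(slow_search, queries))

print(f"peak concurrent searches: {stats['peak']} (cap: {max_workers})")
```

Threads tend to help most when each search spends its time waiting, for example on a remote document store. For CPU-bound searches in pure Python, the gain from a higher worker count is likely to be smaller.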

:::tip When to use `MultiQueryEmbeddingRetriever` instead

If you need semantic search where meaning matters more than exact keyword matches, use [`MultiQueryEmbeddingRetriever`](multiqueryembeddingretriever.mdx) instead. It works with embedding-based Retrievers and requires a Text Embedder.
:::

### Passing additional Retriever parameters

To pass additional run parameters to the underlying Retriever, use the `retriever_kwargs` argument of `run()`. Use parameter names that the wrapped Retriever accepts, such as `top_k`:

```python
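# `multiquery_retriever` is a MultiQueryTextRetriever instance, created as in the Usage section
result = multiquery_retriever.run(
    queries=["renewable energy", "sustainable power"],
    retriever_kwargs={"top_k": 5},  # passed on to the wrapped Retriever
)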
```
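
The following plain-Python sketch shows one way such forwarding can work. It is an illustration under stated assumptions, not Haystack's code: `ToyRetriever` and `run_all` are hypothetical, and the sketch assumes that the same keyword arguments are applied to every per-query call.

```python
from typing import Any, Optional


class ToyRetriever:
    """Hypothetical stand-in for a text-based Retriever with a default top_k."""

    def __init__(self, texts: list[str], top_k: int = 10):
        self.texts = texts
        self.top_k = top_k

    def run(self, query: str, top_k: Optional[int] = None) -> dict[str, list[str]]:
        # A value passed at run time takes precedence over the init default.
        limit = self.top_k if top_k is None else top_k
        matches = [text for text in self.texts if query.lower() in text.lower()]
        return {"documents": matches[:limit]}


def run_all(
    retriever: ToyRetriever,
    queries: list[str],
    retriever_kwargs: Optional[dict[str, Any]] = None,
) -> list[list[str]]:
    # Assumption: the same keyword arguments are forwarded to every per-query call.
    kwargs = retriever_kwargs or {}
    return [retriever.run(query=query, **kwargs)["documents"] for query in queries]


texts = ["Solar energy", "Wind energy", "Renewable energy", "Sustainable power"]
toy = ToyRetriever(texts)

default_hits = run_all(toy, ["energy", "power"])
limited_hits = run_all(toy, ["energy", "power"], retriever_kwargs={"top_k": 1})
print(default_hits)
print(limited_hits)
```

In this sketch the limit applies to each query separately, so the combined result can hold more than `top_k` entries.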

## Usage

### On its own

In this example, we pass three queries manually to the Retriever: "renewable energy", "geothermal", and "hydropower". The Retriever runs a BM25 search for each query (retrieving up to 2 documents per query), then combines all results, removes duplicates, and sorts them by score.

```python
from haystack import Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers import InMemoryBM25Retriever, MultiQueryTextRetriever

documents = [
    Document(content="Renewable energy is energy that is collected from renewable resources."),
    Document(content="Solar energy is a type of green energy that is harnessed from the sun."),
    Document(content="Wind energy is another type of green energy that is generated by wind turbines."),
    Document(content="Hydropower is a form of renewable energy using the flow of water to generate electricity."),
    Document(content="Geothermal energy is heat that comes from the sub-surface of the earth."),
]

document_store = InMemoryDocumentStore()
document_store.write_documents(documents)

retriever = MultiQueryTextRetriever(
    retriever=InMemoryBM25Retriever(document_store=document_store, top_k=2)
)

results = retriever.run(queries=["renewable energy", "geothermal", "hydropower"])

for doc in results["documents"]:
    print(f"Content: {doc.content}, Score: {doc.score:.4f}")
```

### In a pipeline with QueryExpander

This pipeline takes a single query "sustainable power" and expands it into multiple variations using an LLM (for example: "renewable energy sources", "green electricity", "clean power"). The Retriever then searches for each variation and combines the results. This way, documents about "solar energy" or "hydropower" can be found even though they don't contain the words "sustainable power".

<Tabs>
<TabItem value="python" label="Python" default>

```python
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.query import QueryExpander
from haystack.components.retrievers import InMemoryBM25Retriever, MultiQueryTextRetriever

documents = [
    Document(content="Renewable energy is energy that is collected from renewable resources."),
    Document(content="Solar energy is a type of green energy that is harnessed from the sun."),
    Document(content="Wind energy is another type of green energy that is generated by wind turbines."),
    Document(content="Hydropower is a form of renewable energy using the flow of water to generate electricity."),
    Document(content="Geothermal energy is heat that comes from the sub-surface of the earth."),
]

document_store = InMemoryDocumentStore()
document_store.write_documents(documents)

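# Note: QueryExpander calls an LLM. With its default generator, this assumes the OPENAI_API_KEY environment variable is set
pipeline = Pipeline()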
pipeline.add_component("query_expander", QueryExpander(n_expansions=3))
pipeline.add_component(
    "retriever",
    MultiQueryTextRetriever(
        retriever=InMemoryBM25Retriever(document_store=document_store, top_k=2)
    )
)
pipeline.connect("query_expander.queries", "retriever.queries")

result = pipeline.run({"query_expander": {"query": "sustainable power"}})

for doc in result["retriever"]["documents"]:
    print(f"Score: {doc.score:.3f} | {doc.content}")
```

</TabItem>
<TabItem value="yaml" label="YAML">

```yaml
components:
  query_expander:
    type: haystack.components.query.query_expander.QueryExpander
    init_parameters:
      n_expansions: 3
  retriever:
    type: haystack.components.retrievers.multi_query_text_retriever.MultiQueryTextRetriever
    init_parameters:
      retriever:
        type: haystack.components.retrievers.in_memory.bm25_retriever.InMemoryBM25Retriever
        init_parameters:
          document_store:
            type: haystack.document_stores.in_memory.document_store.InMemoryDocumentStore
            init_parameters: {}
          top_k: 2

connections:
  - sender: query_expander.queries
    receiver: retriever.queries
```

</TabItem>
</Tabs>

### In a RAG pipeline

This RAG pipeline answers questions using query expansion. When a user asks "What types of energy come from natural sources?", the pipeline:

1. Expands the question into multiple search queries using an LLM
2. Retrieves relevant documents for each query variation
3. Builds a prompt containing the retrieved documents and the original question
4. Sends the prompt to an LLM to generate an answer

The question is sent to both the `query_expander` (for generating search queries) and the `prompt_builder` (for the final prompt to the LLM).

<Tabs>
<TabItem value="python" label="Python" default>

```python
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.query import QueryExpander
from haystack.components.retrievers import InMemoryBM25Retriever, MultiQueryTextRetriever
from haystack.dataclasses import ChatMessage

documents = [
    Document(content="Renewable energy is energy that is collected from renewable resources."),
    Document(content="Solar energy is a type of green energy that is harnessed from the sun."),
    Document(content="Wind energy is another type of green energy that is generated by wind turbines."),
]

document_store = InMemoryDocumentStore()
document_store.write_documents(documents)

prompt_template = [
    ChatMessage.from_system("You are a helpful assistant that answers questions based on the provided documents."),
    ChatMessage.from_user(
        "Given these documents, answer the question.\n"
        "Documents:\n"
        "{% for doc in documents %}"
        "{{ doc.content }}\n"
        "{% endfor %}\n"
        "Question: {{ question }}"
    )
]

# Note: This assumes the OPENAI_API_KEY environment variable is set
rag_pipeline = Pipeline()
rag_pipeline.add_component("query_expander", QueryExpander(n_expansions=2))
rag_pipeline.add_component(
    "retriever",
    MultiQueryTextRetriever(
        retriever=InMemoryBM25Retriever(document_store=document_store, top_k=2)
    )
)
rag_pipeline.add_component(
    "prompt_builder",
    ChatPromptBuilder(template=prompt_template, required_variables=["documents", "question"])
)
rag_pipeline.add_component("llm", OpenAIChatGenerator())

rag_pipeline.connect("query_expander.queries", "retriever.queries")
rag_pipeline.connect("retriever.documents", "prompt_builder.documents")
rag_pipeline.connect("prompt_builder.prompt", "llm.messages")

question = "What types of energy come from natural sources?"
result = rag_pipeline.run({
    "query_expander": {"query": question},
    "prompt_builder": {"question": question}
})

print(result["llm"]["replies"][0].text)
```

</TabItem>
<TabItem value="yaml" label="YAML">

```yaml
components:
  query_expander:
    type: haystack.components.query.query_expander.QueryExpander
    init_parameters:
      n_expansions: 2
  retriever:
    type: haystack.components.retrievers.multi_query_text_retriever.MultiQueryTextRetriever
    init_parameters:
      retriever:
        type: haystack.components.retrievers.in_memory.bm25_retriever.InMemoryBM25Retriever
        init_parameters:
          document_store:
            type: haystack.document_stores.in_memory.document_store.InMemoryDocumentStore
            init_parameters: {}
          top_k: 2
  prompt_builder:
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
    init_parameters:
      required_variables:
        - documents
        - question
  llm:
    type: haystack.components.generators.chat.openai.OpenAIChatGenerator
    init_parameters: {}

connections:
  - sender: query_expander.queries
    receiver: retriever.queries
  - sender: retriever.documents
    receiver: prompt_builder.documents
  - sender: prompt_builder.prompt
    receiver: llm.messages
```

</TabItem>
</Tabs>
